With $z_\cdot$ a signal process, $w_\cdot$ a Brownian motion, and
$y_t = \int_0^t z_s\,ds + w_t$ a noisy observation, the innovations problem is
to determine whether $y_\cdot$ is adapted to the innovations process $\nu_\cdot$,
which is also a Brownian motion, and is defined using the estimate
$\hat z_t = E\{z_t \mid y_s,\ 0 \le s \le t\}$ by $y_t = \int_0^t \hat z_s\,ds + \nu_t$.
The closely related $\sigma$-algebras problem in stochastic DEs is to determine,
for a given causal drift $\alpha$, when a solution of
$d\xi = \alpha(t, \xi)\,dt + dw$ is a causal functional of $w_\cdot$. Previous results on
these problems are reviewed and extended. In particular, we broach and
answer positively the physically important case of the innovations problem
in which the signal satisfies a stochastic DE with drift depending in part
on the noisy observations. This case is important because it models a
system observed through noise and controlled by feedback of these noisy
observations. The last part of the paper shows that the innovations problem
has a positive resolution if and only if on some probability space there is
a Brownian motion $W$ and a causal solution $\xi$ of $d\xi = \alpha(t, \xi)\,dt + dW$,
where $\alpha$ expresses the estimator $\hat z$; that is, $\alpha$ is a causal functional such
that $\hat z_t = \alpha(t, y)$.

I. INTRODUCTION 

Estimation of signals from past observations of them corrupted by 
noise is a classical problem of filtering theory. The following is a 
standard mathematical idealization of this problem: The signal $z_\cdot$ is a
measurable stochastic process with $E|z_t| < \infty$, the noise $w_\cdot$ is a
Brownian motion, and the observations consist of the process

$$y_t = \int_0^t z_s\,ds + w_t. \tag{1}$$

Define $\hat z_t = E\{z_t \mid y_s,\ 0 \le s \le t\}$, the expected value of $z_t$ given the
past of the observations up to $t$. It can be shown¹ that if $\int_0^t z_s^2\,ds < \infty$
a.s., then there is a measurable version of $\hat z_\cdot$ with $\int_0^t \hat z_s^2\,ds < \infty$ a.s.
The innovations process for this setup is defined to be

$$\nu_t = \int_0^t (z_s - \hat z_s)\,ds + w_t,$$




and it is a basic result of Frost² and also of Kailath³ that, under weak
conditions, $\nu_\cdot$ is itself a Wiener process with respect to the observations.
Thus, (1) is equivalent to the integral equation

$$y_t = \int_0^t \hat z_s\,ds + \nu_t, \tag{2}$$



which reduces the general case (1) to that in which z. is adapted to y., 
a special property useful in questions of absolute continuity in filtering 
and detection. 
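
The relation between eqs. (1) and (2) can be illustrated numerically. The following is a minimal sketch under an assumption that is not in the text: the signal is a single random variable $z \sim N(0,1)$, constant in time and independent of the noise, for which the estimate has the closed form $\hat z_t = y_t/(1+t)$. The function name, step count, and seed are illustrative choices only.

```python
import math
import random


def simulate_innovations(t_end=1.0, n_steps=2000, seed=0):
    """Euler sketch of eqs. (1)-(2) for a constant signal z ~ N(0, 1).

    Assumed model (not from the paper): z is constant in time, so the
    estimate is z_hat_t = y_t / (1 + t) and the innovations are explicit.
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    z = rng.gauss(0.0, 1.0)            # signal, constant in time
    y = nu = int_z_hat = quad_var = 0.0
    for k in range(n_steps):
        z_hat = y / (1.0 + k * dt)     # E{z | y_s, s <= t} at the left endpoint
        dy = z * dt + rng.gauss(0.0, math.sqrt(dt))   # eq. (1): dy = z dt + dw
        d_nu = dy - z_hat * dt         # eq. (2): d(nu) = dy - z_hat dt
        y += dy
        nu += d_nu
        int_z_hat += z_hat * dt
        quad_var += d_nu * d_nu        # approximates the quadratic variation of nu
    return y, nu, int_z_hat, quad_var
```

In this sketch the observation is recovered exactly as the integral of the estimate plus the innovations, and the accumulated squared increments of the innovations stay close to the elapsed time, as expected of a Brownian motion.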

Since $\hat z_\cdot$ is of necessity adapted to $y_\cdot$, eq. (2) purports to define $y_\cdot$
in terms of $\nu_\cdot$; the innovations problem, first posed by Frost,² is precisely
to determine whether it really does. Frost asked: Do the innovations
contain all the information in the observations? [By (2) they do not
contain more.] In the language of probability this is to ask whether
the $\sigma$-algebras that the processes generate are the same up to null
sets; i.e., is

$$\mathcal{Y}_0^t \triangleq \sigma\{y_s,\ s \le t\} = \sigma\{\nu_s,\ s \le t\} \triangleq \mathcal{N}_0^t \pmod P\,?$$

II. THE σ-ALGEBRAS PROBLEM IN STOCHASTIC DEs

The innovations problem is equivalent to an apparently more
general problem from the theory of stochastic DEs, sometimes called
the $\sigma$-algebras problem: Given a causal drift $\alpha(s, x)$, possibly depending
on the past of the function $x$, and a weak solution of the DE
$dx_t = \alpha(t, x)\,dt + dw_t$, with $w_\cdot$ a Brownian motion, and $x_\cdot$ possibly
nonanticipating with respect to $dw_\cdot$, to determine whether

$$\sigma\{x_s,\ s \le t\} = \sigma\{w_s,\ s \le t\} \pmod P.$$

Positive answers to both problems were widely conjectured. 

The innovations problem has been outstanding, in both senses of 
the word, since about 1968, and it has drawn the attention of
communications theorists and probabilists alike. The $\sigma$-algebras problem has
been current in the Soviet Union since the late 1950s; there it has been
the object of great effort and a source of stimulus far in excess of its
simple origins.⁴ Accounts of the innovations problem and its theoretical
background are in lecture notes by Meyer¹ and in a paper by Orey.⁶

It is now known that the answer to the general problem is in the
negative. B. Cirel'son has given a counterexample⁶,¹⁷ for the following
special case (this case shows, incidentally, that the innovations and
$\sigma$-algebras problems are in fact the same): Suppose that the signal $z_\cdot$
is a causal functional $\alpha(t, y)$ of the observations; i.e., the signal is
entirely determined by feedback from the observations. Then $z = \hat z$,
$w = \nu$, and the problem reduces to asking whether the observations
are "well-defined" in the strong sense of being adapted to the noise;
for in this vestigial or degenerate case, the noise is the only process
left. Cirel'son's disturbing example consists of a choice of $\alpha(\cdot, \cdot)$ for
which there is just one weak solution $y_\cdot$, which is nonanticipative in
that the future increments of $w_\cdot$ are independent of the past of $y_\cdot$,
but which cannot be expressed as a functional of $w_\cdot$, causal or not,
over any interval.

Prior to this counterexample, several cases of the problems had
been settled in the affirmative. J. M. C. Clark⁷ proved that if noise
and signal are independent and the signal is bounded (uniformly in $t$
and $\omega$), then observations are adapted to innovations. The author⁸
extended Clark's method and result to the case where signal and noise
are independent and the signal is almost surely (a.s.) square-integrable.
The case of gaussian observations turns out affirmatively: here results
of Hitsuda⁹ imply that $\hat z_\cdot$ is a linear functional of the past of $y_\cdot$ and
eq. (2) is solvable by a Neumann series. Zvonkin¹⁰ has given an
affirmative answer to the $\sigma$-algebras problem for the Markov case
$\alpha(s, y) = \alpha(y_s)$ bounded and homogeneous in time, using the associated scale
function to transform the state space; this result extends to
time-dependent bounded $\alpha(s, y_s)$ satisfying Dini's condition.

It should be remarked that although the innovations and $\sigma$-algebras
problems are mathematically equivalent, they arise in different 
contexts, involve different emphases, and can be usefully contrasted, 
as discussed below. 

The innovations problem arose in filtering theory, and it focuses
especially on the nature of the filter or operator that gives $\hat z_t$ as a
causal functional $\alpha(t, y)$ on the past of $y_\cdot$; from this point of view,
the example of Cirel'son, in which there is no real filtering going on,
is a bit wide of the mark; the real problem is to find out enough about
the filter to be able to settle whether $\sigma\{y\} = \sigma\{\nu\}$ in cases where
there is a real signal (determined in part by sources other than the
noise, and in part possibly by control or feedback based on the
observations), which it is desired to control, transmit, filter, or detect.

The $\sigma$-algebras problem arises in stochastic functional DEs, and is
therefore more general in scope, since the drift functionals considered
need no longer be filters or conditional expectations like $\hat z_t$; the
emphasis is on dynamics, causality, and nonanticipation, with no
admixture of estimation. If the drift functional to be considered is a
filter, it may have special properties that are useful in the
investigation. (See the method of Clark.⁷)

III. SUMMARY 

Relevant notions from stochastic DEs are defined in Section IV:
causal functionals, weak solutions, causal solutions, and nonanticipative
solutions. A necessary and sufficient martingale-type condition for validity
of the innovations conjecture is given in Section V; as an application, this
condition yields (Section VI) the (known) validity of the conjecture for the
case of gaussian observations. Section VII describes some of Zvonkin's
positive results for the Markov case and an extension. In Section VIII, we
investigate the problem of calculating the estimate $\hat z_\cdot$ and give various
relationships based on absolute continuity of measures. Section IX
is devoted to the physically important case of signals $z_\cdot$ that solve
stochastic DEs with drift based on feedback of observations; we show
that if this drift is Lipschitz in the feedback and of linear growth in the
signal, uniformly in the observations, then observations are adapted to
innovations. In Sections X and XI, finally, we show that validity of the
innovations conjecture is equivalent to the causal solvability, on some
probability space, of the equation $d\xi = \alpha(t, \xi)\,dt + dW$, with $W$
Brownian and $\alpha$ the (a?) functional such that $\hat z_t = \alpha(t, y)$.

IV. CAUSAL SOLUTIONS OF STOCHASTIC DEs 

A measurable functional $\gamma : [0, \infty) \times C[0, \infty) \to R$ is called causal
if for each $t \in [0, \infty)$, $x_s = y_s$ for $s \le t$ implies

$$\gamma(t, x) = \gamma(t, y), \qquad x, y \in C[0, \infty).$$

The idea expressed by this definition is the physical one that $\gamma(t, \cdot)$
cannot depend functionally on any more than the past of its argument
up to $t$; thus, it has the same value at $t$ for two functions that agree
for $s \le t$. In spite of the presence of the word "depend" in the previous
sentence, causality of a functional is expressible as a measurability
property, and has no immediate relation to any probability measure.
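
The definition of causality can be checked mechanically on sampled paths. The sketch below is illustrative only: paths are represented as Python lists on a common time grid, and the two functionals (`running_max`, `endpoint_value`) are hypothetical examples chosen to show one causal and one non-causal case.

```python
def running_max(t_index, path):
    """gamma(t, x) = max over s <= t of x_s; uses only the past of the path."""
    return max(path[: t_index + 1])


def endpoint_value(t_index, path):
    """A non-causal functional: it reads the final sample whatever t is."""
    return path[-1]


def same_value_on_shared_past(functional, t_index, x, y):
    """Compare functional(t, x) and functional(t, y) for two paths that
    agree up to and including t_index (this agreement is checked first)."""
    if x[: t_index + 1] != y[: t_index + 1]:
        raise ValueError("paths must agree up to t_index")
    return functional(t_index, x) == functional(t_index, y)
```

For two paths that share their first three samples and then diverge, `running_max` returns equal values at the third sample while `endpoint_value` does not, which is exactly the distinction the definition draws.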
Let $\alpha$ be a causal functional. A weak solution of the stochastic DE,

$$dx = \alpha(t, x)\,dt + dW, \qquad W_\cdot \text{ Brownian}, \tag{3}$$

is a process $\xi_\cdot$ such that

$$(T\xi)_t = \xi_t - \int_0^t \alpha(s, \xi)\,ds = \nu_t$$

is a Brownian motion on its own past. If $\xi_\cdot$ is adapted to $\nu_\cdot$; i.e., if for
each $t$, $\xi_t$ is measurable with respect to $\sigma\{\nu_s,\ s \le t\}$, then $\xi_\cdot$ is called
a causal solution, and there is a causal functional $\varphi$ such that
$\xi_t = \varphi(t, \nu)$ at each $t$ with probability one. A solution $\xi_\cdot$ is called
nonanticipative if (roughly) the future increments of $\nu_\cdot$ are independent of
the past of $\xi_\cdot$; i.e., for each $t$

$$\sigma\{\nu_u - \nu_t,\ u \ge t\} \perp\!\!\!\perp \sigma\{\xi_s,\ s \le t\}.$$

This is a probabilistic property, and it is equivalent to $\nu$'s being a
martingale on the larger algebras $\sigma\{\xi_s,\ s \le t\}$. It can be seen that for
(3), causal $\Rightarrow$ nonanticipative, but the converse is false: it is known
that there are drift functionals $\alpha$ for which there exist nonanticipative
weak solutions, but no causal solutions.¹⁷

V. A SEMI-MARTINGALE CONDITION 

Theorem 1: $\sigma\{\nu_s,\ s \le t\} = \sigma\{y_s,\ s \le t\}$ (mod $P$) for each $t \ge 0$ iff there
is a $\mathcal{Y}_0^\cdot$-adapted martingale $x_t$ (i.e., a martingale on the past of $y_\cdot$) and a
causal functional $\psi : [0, \infty) \times C[0, \infty) \to R$ such that the observations are
representable (as a semimartingale on their own past) by

$$y_t = x_t + \int_0^t \psi(s, x)\,ds.$$

Proof: The hypothesis on $y_\cdot$ and the equation $dy = \hat z\,dt + d\nu$ imply
that

$$x_t - \nu_t = \int_0^t [\hat z_s - \psi(s, x)]\,ds. \tag{4}$$

The innovations process is a Wiener process with respect to the
observations; thus, the left side of (4) is a continuous $\mathcal{Y}_0^\cdot$-martingale.
Since the right side is absolutely continuous in $t$, it follows that both
sides vanish identically, so that $x_\cdot$ and $\nu_\cdot$ are indistinguishable
processes, and

$$\int_0^t \hat z_s\,ds = \int_0^t \psi(s, x)\,ds = \int_0^t \psi(s, \nu)\,ds.$$

But the right-hand side is $\nu_\cdot$-adapted because $\psi$ is causal. The theorem
follows from $dy = \hat z\,dt + d\nu$. For the converse, we argue thus: if the
$\sigma$-algebras coincide, there is a causal functional $\varphi$ such that $y_t = \varphi(t, \nu)$.
The innovations theorem makes $\nu_\cdot$ a $\mathcal{Y}_0^\cdot$-martingale with

$$y_t = \nu_t + \int_0^t \hat z \circ \varphi\,ds;$$

then, take $x = \nu$ and $\psi = \hat z \circ \varphi$.

VI. APPLICATION TO GAUSSIAN OBSERVATIONS 

The theorem just proved affords us a simple demonstration of the 
validity of the innovations conjecture for gaussian observations. 
Suppose that the signal $z_\cdot$ is square-integrable almost surely. Then, a
theorem of Kailath and Zakai¹¹ implies that the measure induced by $y_\cdot$
is absolutely continuous with respect to Wiener measure. The class of
gaussian processes absolutely continuous with respect to Wiener
measure has been characterized in a causal way by Hitsuda⁹: a process
$y_\cdot$ belongs to this class iff there is a Wiener process $W_\cdot$ adapted to $y_\cdot$
and a Volterra kernel $m(\cdot, \cdot) \in L^2[0, 1]^2$ such that

$$y_t = W_t + \int_0^t \int_0^s m(s, u)\,dW_u\,ds.$$

Thus, $W_\cdot$ is a martingale on the past of $y_\cdot$ and in Theorem 1 we can set
$x = W$, $\psi(s, W) = \int_0^s m(s, u)\,dW_u$, to conclude that $\nu$ is $W$ and $\hat z$ is $\psi$.
Iteration of the relation between $y$ and $W$ gives

$$\nu_t = y_t - \int_0^t \Big( \int_0^s m(s, u)\,dy_u - \int_0^s m(s, u) \int_0^u m(u, v)\,dy_v\,du + \cdots \Big)\,ds,$$

so that letting

$$l(t, s) = m(t, s) - \int_0^t m(t, u)\,m(u, s)\,du + \cdots$$

be the Neumann series or the resolvent of $m(\cdot, \cdot)$, we see that

$$y_t = \nu_t + \int_0^t \int_0^s l(s, u)\,dy_u\,ds, \qquad \hat z_t = \int_0^t l(t, u)\,dy_u.$$

Thus, the map $\hat z$ is linear in $y_\cdot$ when $y_\cdot$ is gaussian, as was expected.
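
The Neumann series for the resolvent can be illustrated on a time grid. The sketch below assumes a discretization that is not in the text: the Volterra kernel is replaced by a strictly lower-triangular matrix $M$ (kernel values times the grid step), so that $dy = (I + M)\,dW$ and the series $L = M - M^2 + M^3 - \cdots$ terminates. The helper names are illustrative.

```python
def mat_mul(a, b):
    """Plain product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_resolvent(m):
    """Return L = M - M^2 + M^3 - ... for a strictly lower-triangular M.

    Such an M is nilpotent (M^n = 0 for an n x n matrix), so n terms of the
    series are enough and no convergence question arises on the grid.
    """
    n = len(m)
    resolvent = [[0.0] * n for _ in range(n)]
    term = [row[:] for row in m]       # current power of M
    sign = 1.0
    for _ in range(n):
        for i in range(n):
            for j in range(n):
                resolvent[i][j] += sign * term[i][j]
        term = mat_mul(term, m)
        sign = -sign
    return resolvent
```

On the grid, $(I + M)(I - L) = I$, which is the discrete counterpart of recovering the innovations from the observations by $d\nu = dy - (L\,dy)$, with $L\,dy$ playing the role of $\hat z\,dt$.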

VII. RESULTS OF ZVONKIN FOR THE MARKOV CASE 

For a stochastic DE of the form $dy_t = \alpha(y_t)\,dt + dw_t$, Zvonkin¹⁰ has
shown that if $\alpha(\cdot)$ is bounded, then there is a causal solution $y_\cdot$. His
procedure¹² is to look at the scale function

$$u(y) = \int_0^y \exp\Big(-2 \int_0^z \alpha(s)\,ds\Big)\,dz = \int_0^y \beta(z)\,dz \tag{5}$$

and to note that $\sigma\{y_s,\ s \le t\} = \sigma\{u(y_s),\ s \le t\}$ because $u(\cdot)$ is
monotone. Then to show that $u(y_s)$ can be got causally from $w_\cdot$, he uses
Ito's rule on $z_t = u(y_t)$ to get

$$dz_t = \beta(y_t)\,dy_t - \beta(y_t)\,\alpha(y_t)\,dt = \beta[u^{-1}(z_t)]\,dw_t.$$

By calculus he finds $\alpha(\cdot)$ bounded $\Rightarrow$ $\beta(u^{-1}) \in \mathrm{Lip}$; hence, $z_\cdot$ is a
causal functional of $w_\cdot$ and so is $y_\cdot$.
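
The scale function (5) is easy to evaluate numerically. The following sketch uses a nested trapezoid rule; the quadrature, the function name, and the test drift $\tanh$ are assumptions made for illustration. For the bounded drift $\alpha(y) = \tanh y$ the integrals can be done by hand, giving $\beta(z) = \cosh^{-2} z$ and $u(y) = \tanh y$, which provides a check.

```python
import math


def scale_function(drift, y, n=2000):
    """Approximate u(y) = int_0^y exp(-2 int_0^z drift(s) ds) dz, eq. (5).

    Both integrals use the trapezoid rule on the same grid of n steps;
    a negative y is handled through the negative step size.
    """
    h = y / n
    inner = 0.0        # running value of int_0^z drift(s) ds
    total = 0.0        # running value of the outer integral
    prev_beta = 1.0    # beta(0) = exp(0)
    for k in range(1, n + 1):
        z0, z1 = (k - 1) * h, k * h
        inner += 0.5 * h * (drift(z0) + drift(z1))
        beta = math.exp(-2.0 * inner)
        total += 0.5 * h * (prev_beta + beta)
        prev_beta = beta
    return total
```

Since $u' = \beta > 0$, the computed $u$ is strictly increasing, which is the monotonicity used above to pass between the $\sigma$-algebras of $y_\cdot$ and $u(y_\cdot)$.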

Now this argument depends in part on the fact that $u$ satisfies
$\tfrac12 u'' + u'\alpha = 0$, and at once suggests extensions to the inhomogeneous
case $\alpha(t, y) = \alpha(t, y_t)$. We give an example based on Zvonkin's paper:¹⁰

Theorem 2: Suppose that for some strictly increasing function $k(\cdot)$ there
exists a solution $u(t, y)$ of the Cauchy problem

$$u(0, y) = k(y)$$
$$u_1 + \tfrac12 u_{22} + \alpha(t, y)\,u_2 = \mathfrak{L}u = 0^*$$

such that $(\log u_2)_2$ is bounded. Then, the stochastic DE

$$dy_t = \alpha(t, y_t)\,dt + dw_t \tag{6}$$

has a causal solution.

Proof: Since $k\uparrow$, it is clear that $u_2 > 0$ and that the transformation
$z_t = u(t, y_t)$ is bijective. Using Ito's rule, we calculate the stochastic
differential of $z_\cdot$ as

$$dz_t = (\mathfrak{L}u)(t, y_t)\,dt + u_2(t, y_t)\,dw_t = u_2[t, u^{-1}(t, z_t)]\,dw_t. \tag{7}$$

Now

$$\frac{\partial}{\partial z}\,u_2[t, u^{-1}(t, z)] = \frac{u_{22}[t, u^{-1}(t, z)]}{u_2[t, u^{-1}(t, z)]} = (\log u_2)_2\Big|_{y = u^{-1}(t, z)}, \quad \text{bounded.}$$

Hence, by the usual Ito theory of stochastic DEs, the martingale
eq. (7) has a unique causal solution $z_\cdot$ which is a bijection of $y_\cdot$
pointwise in time. Hence, $y_\cdot$ is a causal functional of $w_\cdot$ too. The homogeneous
case follows if we take $u(t, y)$ = the scale function (5).
In a similar vein we can show this result:

Theorem 3: If $\alpha(\cdot, \cdot)$ is bounded and such that

$$\frac{\partial}{\partial t} \int_0^y \alpha(t, z)\,dz$$

exists and is bounded, then (6) has a causal solution.

Proof: Let

$$u(t, y) = \int_0^y \exp\Big(-2 \int_0^z \alpha(t, x)\,dx\Big)\,dz$$

so that $u_2 > 0$, and $z_t = u(t, y_t)$ is bijective. We have

$$dz_t = u_1[t, u^{-1}(t, z_t)]\,dt + u_2[t, u^{-1}(t, z_t)]\,dw_t. \tag{8}$$

By calculus obtain formally, with $y = u^{-1}(t, z)$,

$$\frac{\partial}{\partial z}\,u_2[t, u^{-1}(t, z)] = -2\alpha(t, y)$$
$$\frac{\partial}{\partial z}\,u_1[t, u^{-1}(t, z)] = -2\,\frac{\partial}{\partial t} \int_0^y \alpha(t, x)\,dx,$$

both bounded by hypothesis. Hence, (8), and so (6), has a causal
solution, by Ito's theory.

* Numerical indexes show which variable is differentiated and how many times.

Note that Zvonkin's Markov case is not relevant to Kailath's
innovations problem, because the filter giving $\hat z_t$ will almost invariably
depend with advantage on the whole past of the observation process.

VIII. CALCULATION OF THE ESTIMATOR ẑ

Let $(\Omega, \mathcal{B}, P)$ be a probability space on which are defined the signal
process $z_\cdot$, the noise $w_\cdot$, the estimate $\hat z_\cdot$, and the innovations process $\nu_\cdot$,
related by

$$y_t = \int_0^t z_s\,ds + w_t = \int_0^t \hat z_s\,ds + \nu_t.$$

Under some mild technical assumptions, this setup has implicit in it a
rich structure that allows us to give a "formula" for $\hat z$, inter alia. To
penetrate deeper into the situation, it is convenient to introduce an
absolutely continuous change of measure which makes the observation
process Brownian. We shall restrict attention to the interval $0 \le t \le 1$,
and assume that

(i) $\int_0^1 z_s^2\,ds < \infty$ a.s.

(ii) There is a system of increasing $\sigma$-algebras $\mathcal{F}_t$, $0 \le t \le 1$, to
which $z_\cdot$ and $w_\cdot$ are adapted, and $w_\cdot$ is a Wiener process with respect
to $(P, \mathcal{F}_\cdot)$.

(iii) $E \exp\Big\{ -\int_0^1 z_s\,dw_s - \tfrac12 \int_0^1 z_s^2\,ds \Big\} = 1.$

Then by Girsanov's theorem,¹³ the observation process $y_\cdot$ is a
Wiener process on $\mathcal{F}_\cdot$ (and thus on $\mathcal{Y}_0^\cdot = \sigma\{y_s,\ s \le \cdot\}$) under the
transformed measure $P_0$ defined on $\mathcal{F}_1$ by

$$\frac{dP_0}{dP} = \exp\Big\{ -\int_0^1 z_s\,dw_s - \tfrac12 \int_0^1 z_s^2\,ds \Big\}.$$

It is convenient to use the functional notation⁵

$$q(f, g)_t = \exp\Big\{ \int_0^t f_s\,dg_s - \tfrac12 \int_0^t f_s^2\,ds \Big\}.$$
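
On a time grid the functional $q(f, g)_t$ becomes a pair of sums. The sketch below is a hedged illustration: it assumes left-endpoint (Ito) sums with a fixed step `dt`, and the function name `girsanov_q` is a hypothetical choice.

```python
import math


def girsanov_q(f, dg, dt):
    """Discretized q(f, g)_t = exp(sum f_k dg_k - 0.5 * sum f_k^2 dt).

    f  : samples f_k of the integrand at the left endpoints of each step
    dg : increments dg_k of the integrator over the same steps
    dt : common step length
    """
    stochastic_integral = sum(fk * dgk for fk, dgk in zip(f, dg))
    energy = sum(fk * fk for fk in f) * dt
    return math.exp(stochastic_integral - 0.5 * energy)
```

With increments $dy_k = z_k\,dt + dw_k$, the product of `girsanov_q(z, dy, dt)` and the discretized density $\exp\{-\sum z_k\,dw_k - \tfrac12 \sum z_k^2\,dt\}$ equals 1 up to rounding, which is the grid version of the statement that $q(z, y)_1$ inverts the change of measure.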

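Then $(dP_0/dP)^{-1} = q(z, y)_1 > 0$ a.s. and for $A \in \mathcal{F}_1$

$$P(A) = \int_A q(z, y)_1\,dP_0, \qquad \frac{dP}{dP_0} = q(z, y)_1,$$

so that $P \ll P_0$, and so $P \sim P_0$. The following formula for $\hat z$ is then
readily justified: with $\mathcal{Y}_0^t = \sigma\{y_s,\ 0 \le s \le t\}$,

$$\hat z_t = \frac{E_0\{z_t\,q(z, y)_1 \mid \mathcal{Y}_0^t\}}{E_0\{q(z, y)_1 \mid \mathcal{Y}_0^t\}}.$$

For let $A \in \mathcal{Y}_0^t$, so that

$$\int_A z_t\,dP = \int_A z_t\,\frac{dP}{dP_0}\,dP_0 = \int_A E_0\{z_t\,q(z, y)_1 \mid \mathcal{Y}_0^t\}\,dP_0$$
$$= \int_A \frac{E_0\{z_t\,q(z, y)_1 \mid \mathcal{Y}_0^t\}}{E_0\{q(z, y)_1 \mid \mathcal{Y}_0^t\}}\,E_0\{q(z, y)_1 \mid \mathcal{Y}_0^t\}\,dP_0 = \int_A \frac{E_0\{z_t\,q(z, y)_1 \mid \mathcal{Y}_0^t\}}{E_0\{q(z, y)_1 \mid \mathcal{Y}_0^t\}}\,dP, \tag{9}$$

since for $A \in \mathcal{Y}_0^t$ and a $\mathcal{Y}_0^t$-measurable function $g$ integrable $dP$

$$\int_A g\,dP = \int_A g\,\frac{dP}{dP_0}\,dP_0 = \int_A g\,E_0\Big\{\frac{dP}{dP_0} \,\Big|\, \mathcal{Y}_0^t\Big\}\,dP_0.$$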
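Thus, the ratio (integrand) in eq. (9) is a version of $\hat z_t$. The process

$$E_0\Big\{\frac{dP}{dP_0} \,\Big|\, \mathcal{Y}_0^t\Big\}$$

is a positive martingale on the past of the Brownian (under $P_0$) motion
$y_\cdot$ and can therefore be expected to have a special form. It can in fact
be shown by arguments of Shiryaev and Liptser¹⁴ that

$$E\Big\{\frac{dP_0}{dP} \,\Big|\, \mathcal{Y}_0^t\Big\} = q^{-1}(\hat z, y)_t.$$

From this it follows that

$$E_0\Big\{\frac{dP}{dP_0} \,\Big|\, \mathcal{Y}_0^t\Big\} = q(\hat z, y)_t.$$

It is obvious intuitively that $q(z, y)_1$ in eq. (9) can be changed to
$q(z, y)_t$: since $q(z, y)_\cdot$ is an $\mathcal{F}_\cdot$-martingale, we have a.s.

$$E_0\Big\{\frac{dP}{dP_0} \,\Big|\, \mathcal{F}_t\Big\} = E_0\{q(z, y)_1 \mid \mathcal{F}_t\} = q(z, y)_t$$
$$E_0\Big\{z_t\,\frac{dP}{dP_0} \,\Big|\, \mathcal{F}_t\Big\} = E_0\{z_t\,q(z, y)_t \mid \mathcal{F}_t\}.$$

Hence, $\mathcal{Y}_0^t \subseteq \mathcal{F}_t$ gives

$$E_0\Big\{z_t\,\frac{dP}{dP_0} \,\Big|\, \mathcal{Y}_0^t\Big\} = E_0\{z_t\,q(z, y)_t \mid \mathcal{Y}_0^t\} \quad \text{a.s.}$$

Thus, the filter $\hat z_\cdot$ can be represented as

$$\hat z_t = \frac{E_0\{z_t\,q(z, y)_t \mid \mathcal{Y}_0^t\}}{q(\hat z, y)_t} = \frac{\frac{d}{dt}\langle y,\ q(\hat z, y)\rangle_t}{q(\hat z, y)_t}.$$

The last equation, it can be verified, is an identity valid for any
$\mathcal{Y}_0^\cdot$-adapted, a.s. square-integrable functional, not just $\hat z$. Thus, the meat
of the formula is the numerator, as could be expected intuitively since
the denominator is basically a normalizer.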
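
A simple special case makes the ratio formula concrete. The sketch below rests on assumptions not made in the text: the signal is a random constant $z$ taking finitely many values with known prior weights, independent of the noise, so that under $P_0$ it keeps its prior law and is independent of $y_\cdot$. Then $q(z, y)_t = \exp(z\,y_t - \tfrac12 z^2 t)$ and both conditional expectations reduce to finite sums. The function names are illustrative.

```python
import math


def q_constant(z, y_t, t):
    """q(z, y)_t for a signal that is the constant z on [0, t]."""
    return math.exp(z * y_t - 0.5 * z * z * t)


def estimate(y_t, t, values, weights):
    """Ratio of prior-weighted sums: numerator carries z * q, denominator q.

    values  : possible values of the constant signal (assumed finite)
    weights : their prior probabilities
    """
    numerator = sum(p * z * q_constant(z, y_t, t) for z, p in zip(values, weights))
    denominator = sum(p * q_constant(z, y_t, t) for z, p in zip(values, weights))
    return numerator / denominator
```

For the symmetric two-point prior on $\{-1, +1\}$ the common factor $e^{-t/2}$ cancels and the estimate reduces to $\tanh(y_t)$, so the denominator indeed acts only as a normalizer.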

IX. SIGNALS SOLVING ITO DEs WITH DRIFT BASED ON FEEDBACK 
OF OBSERVATIONS 

The positive results of Clark,⁷ based on the assumption of
independence between signal and noise, seem adequate for many practical
purposes of one-way communication or detection. For the more 
general physical applications to estimation and control, involving 
feedback of observations to control the signal, it would be pleasant to 
be able to weaken this assumption and allow some physically reason- 
able dependence between signal and noise. A natural setup to investi- 
gate is a generalization of the usual Kalman filter situation, in which 
the signal and the observation each solves a stochastic de, with 
independent driving white noises, and with the drift for the signal 
equation depending on the observations. Thus, we let the signal $z_\cdot$
and observation $y_\cdot$ respectively solve

$$dz_t = b(t, z, y)\,dt + dW_t \tag{10}$$
$$dy_t = z_t\,dt + dw_t. \tag{11}$$

The second equation is, of course, eq. (1) differentiated; the functional
$b$, causal in both $z$ and $y$ simultaneously, represents the deterministic
dynamics of a system described by $z_\cdot$ and depending on the past of
both signal and observation.

As noted at the end of Section I, the only way to make headway is to
find out something about the form of the filter that gives $\hat z_t$; we shall
show that in the case of eqs. (10) and (11) an analog of the
Kallianpur-Striebel¹⁵ formula for $\hat z$ provides enough structure on which to hang a
proof similar to Clark's.⁷

Let us assume, as is physically reasonable, that $b(t, z, y)$ grows at
most linearly with $\sup_{0 \le s \le t} |z_s|$, uniformly in $y$. Then

$$E\,q[b(W, w), W]_1\,q(W, w)_1 = 1,$$

and we can "solve" (10) and (11) by Girsanov's theorem in such a way
that the joint solution process $(z_t, y_t)$ is absolutely continuous with
respect to the two-dimensional Brownian motion, here $(W_t, w_t)$, with
derivative $q[b(W, w), W]_1\,q(W, w)_1$. It is then easily seen that $\hat z_t$
should have the form

$$\hat z_t = \frac{\int M(dW)\,q[b(W, y), W]_t\,q(W, y)_t\,W_t}{\int M(dW)\,q[b(W, y), W]_t\,q(W, y)_t}, \tag{12}$$

where $M$ is the Wiener measure for $W$ alone. Verification is left to the
reader; either of two more or less equivalent methods will do: direct
integration over $\mathcal{Y}_0^t$ sets using the absolute continuity or introduction
of $P_1$ by $dP_1 = q(-z, w)_1\,q[-b(z, y), W]_1\,dP$, and a use of it similar
to that of $P_0$ in Section VIII. Note that $P_1$ makes $(z, y)$ a 2-dimensional
Brownian motion.

Theorem 4: If $z_\cdot$ and $y_\cdot$ are nonanticipating solutions of eqs. (10) and (11),
and there exists a constant $K$ such that for $x$, $\xi$, and $\eta \in C[0, \infty)$

$$|b(t, x, \eta)| \le K\big[1 + \sup_{s \le t} |x(s)|\big], \tag{13}$$
$$|b(t, x, \eta) - b(t, x, \xi)| \le K \sup_{s \le t} |\eta(s) - \xi(s)|,$$

then $\sigma\{y_s,\ s \le t\} = \sigma\{\nu_s,\ s \le t\}$ (mod $P$).

Remark: Eq. (13) does not imply that eqs. (10) and (11) have a unique
causal solution. In fact, our argument will not devolve on whether eqs. (10) and
(11) have a strong solution at all; the unique (in law)
nonanticipative solution is the Girsanov solution, with derivative
$q[b(W, w), W]\,q(W, w)$, which determines $\hat z$ via eq. (12).

Proof: With the explicit form eq. (12) for $\hat z$ available, a form of
argument previously used by the author⁸ (and generalized from that of
Clark⁷) can be used: we exhibit a sequence of $\nu_\cdot$-adapted processes
converging to $\hat z_\cdot$; the result then follows from eq. (2). Let

$$S(m, t) = \Big\{ \sup_{0 \le s \le t} |W_s| \le m \Big\}, \qquad m = 1, 2, \cdots.$$

It can be seen that the approximations

$$\hat z_m(t) = \frac{\int_{S(m,t)} M(dW)\,q[b(W, y), W]_t\,q(W, y)_t\,W_t}{\int_{S(m,t)} M(dW)\,q[b(W, y), W]_t\,q(W, y)_t} = \hat z_m(y)_t \tag{14}$$

approach $\hat z_t$ as $m \to \infty$, and that each one is adapted to $y_\cdot$; therefore, it
is enough to prove that each one is adapted to $\nu_\cdot$ itself. Now set

$$\hat z_t^{m,0} = 0, \qquad m = 1, 2, \cdots$$
$$y_t^{m,n} = \int_0^t \hat z_s^{m,n}\,ds + \nu_t, \qquad m \wedge n \ge 1$$
$$\hat z_t^{m,n+1} = \hat z_m(y^{m,n})_t,$$
$$y_t^m = \int_0^t \hat z_m(s)\,ds + \nu_t,$$

and note that $d(y^{m,n} - y^m) = (\hat z^{m,n} - \hat z^m)\,dt$ and

$$q(f, y^{m,n})_t = q(f, y^m)_t \exp \int_0^t f_s\,(\hat z^{m,n} - \hat z^m)_s\,ds.$$

yo 

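With $\hat z^{m,n} - \hat z^m = \psi^{m,n}$ for short, we find from eq. (14) that

$$\psi_t^{m,n+1} = \int_{S(m,t)} M(dW)\,W_t \left\{ \frac{q[b(W, y^{m,n}), W]_t\,q(W, y^{m,n})_t}{A_t^m(y^{m,n})} - \frac{q[b(W, y^m), W]_t\,q(W, y^m)_t}{A_t^m(y^m)} \right\},$$

where

$$A_t^m(f) = \int_{S(m,t)} M(dW)\,q[b(W, f), W]_t\,q(W, f)_t.$$

Subtracting the two fractions on the right of $\psi_t^{m,n+1}$ above, and using
the further abbreviation $p(f, g)_t$ for $q[b(f, g), f]_t$, we find

$$\psi_t^{m,n+1} = \int_{S(m,t)^2} M(dW)\,M(dw)\,W_t\,\frac{p(w, y^m)_t\,q(w, y^m)_t\,p(W, y^{m,n})_t\,q(W, y^{m,n})_t - p(w, y^{m,n})_t\,q(w, y^{m,n})_t\,p(W, y^m)_t\,q(W, y^m)_t}{A_t^m(y^{m,n})\,A_t^m(y^m)}.$$

The numerator in the integrand is just

$$p(w, y^m)_t\,q(w, y^m)_t\,p(W, y^m)_t\,q(W, y^m)_t$$
$$\cdot \Big\{ \exp\Big[ \int_0^t [b(W, y^{m,n}) - b(W, y^m)]\,dW - \tfrac12 \int_0^t [b^2(W, y^{m,n}) - b^2(W, y^m)]\,ds + \int_0^t W_s\,\psi_s^{m,n}\,ds \Big]$$
$$- \exp\Big[ \int_0^t [b(w, y^{m,n}) - b(w, y^m)]\,dw - \tfrac12 \int_0^t [b^2(w, y^{m,n}) - b^2(w, y^m)]\,ds + \int_0^t w_s\,\psi_s^{m,n}\,ds \Big] \Big\},$$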



THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1976



so we can use the inequality $|e^A - e^B| \le \tfrac12 (e^A + e^B)\,|A - B|$ to find
that

$$|\psi_t^{m,n+1}| \le \frac{m}{2} \int_{S(m,t)^2} M(dW)\,M(dw)\,\frac{p(w, y^m)_t\,q(w, y^m)_t\,p(W, y^{m,n})_t\,q(W, y^{m,n})_t + p(w, y^{m,n})_t\,q(w, y^{m,n})_t\,p(W, y^m)_t\,q(W, y^m)_t}{A_t^m(y^{m,n})\,A_t^m(y^m)}$$
$$\cdot \Big| \int_0^t [b(W, y^{m,n}) - b(W, y^m)]\,dW - \tfrac12 \int_0^t [b^2(W, y^{m,n}) - b^2(W, y^m)]\,ds + \int_0^t W_s\,\psi_s^{m,n}\,ds$$
$$- \int_0^t [b(w, y^{m,n}) - b(w, y^m)]\,dw + \tfrac12 \int_0^t [b^2(w, y^{m,n}) - b^2(w, y^m)]\,ds - \int_0^t w_s\,\psi_s^{m,n}\,ds \Big|.$$

The Lipschitz and growth conditions on $b$ imply that on the range of
integration, with $x = W$ or $w$

$$|b^2(x, y^{m,n}) - b^2(x, y^m)| \le 2K^2 (1 + m) \sup_{s \le t} |y_s^{m,n} - y_s^m| \le 2K^2 (1 + m) \int_0^t |\psi_u^{m,n}|\,du,$$

where we have used $d(y^{m,n} - y^m) = \psi^{m,n}\,dt$. Hence, with

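$$\theta^{mn}(W) = \int_0^t [b(W, y^{m,n}) - b(W, y^m)]\,dW,$$

we find

$$|\psi_t^{m,n+1}| \le m^2 \int_0^t |\psi_s^{m,n}|\,ds + 2K^2 (1 + m) \int_0^t \int_0^s |\psi_u^{m,n}|\,du\,ds$$
$$+ \frac{m}{2} \int_{S(m,t)^2} M(dW)\,M(dw)\,|\theta^{mn}(W) - \theta^{mn}(w)|\,\frac{p(w, y^m)_t\,q(w, y^m)_t\,p(W, y^{m,n})_t\,q(W, y^{m,n})_t + p(w, y^{m,n})_t\,q(w, y^{m,n})_t\,p(W, y^m)_t\,q(W, y^m)_t}{A_t^m(y^{m,n})\,A_t^m(y^m)}.$$

Since $\theta^{mn}$ is a stochastic integral, Schwarz's inequality implies that

$$\int_{S(m,t)} M(dW)\,\theta^{mn}(W)\,p(W, y^{m,n})_t\,q(W, y^{m,n})_t \le \Big[ \int M(dW)\,\theta^{mn}(W)^2 \Big]^{1/2} \cdot \Big[ \int_{S(m,t)} M(dW)\,p^2(W, y^{m,n})_t\,q^2(W, y^{m,n})_t \Big]^{1/2}. \tag{15}$$

To bound the second factor on the right, we use

$$q^2(f, g)_t = q(2f, g)_t \exp \int_0^t |f_s|^2\,ds$$

and the relations

$$y_t^m = \int_0^t \hat z_s^m\,ds + \nu_t$$
$$\int_0^t W_s\,d\nu_s = W_t\,\nu_t - \int_0^t \nu_s\,dW_s$$

to find that on $S(m, t)$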
P 2 (W,y m ) t q*(W,y m ) t 

= q&HW, y m ), W] t q(2W, y m ) t exp J V b*(W, y m )ds + V W\ds 

= q[2b(W, |T) - 2v, Wit exp J V [b{W, y m ). - Zv.Jds 

+ 2v t W t + 2 f* W,2?ds + J* W 2 \ 

^ q[2b{W, y m ) - 2v, W] t exp hmH + 2m\v t \ 

Since the $q$ factor on the right integrates to 1 with respect to $M(dW)$,
it can be seen that the square root of

$$\int_{S(m,t)} M(dW)\,p^2(W, y^m)_t\,q^2(W, y^m)_t$$

is bounded by a $t$-integrable function depending on $m$ and $\nu$. Since
$|\hat z^{m,n}| \le m$, the same result holds with $y^{m,n}$ for $y^m$ in $p$ and $q$. Also, by
Jensen's inequality, with $S = S(m, t)$

A?{y m ) = f M{dW)q[b{W, jr), Wl t q(W, y-)i 
J s 

£ M{S\ exp \m- 1 [S}J \ j l b{W, y m )dW - i [' b 2 (W,y m )d8 

£ M {£} exp I-*2S?(1 + m)H-$mH - m\v t \ 

■ f M(dW)J^ [b(W, y m ), - v .]dW. 
If $\tau = \inf\{s : |W_s| = m\}$, the integral in the exponent is bounded by
the square root of

$$\int M(dW) \int_0^t \chi_{\tau > s}\,[b(W, y^m)_s - \nu_s]^2\,ds \le \int_0^t [K(1 + m) + |\nu_s|]^2\,ds.$$

The same argument applies to $A_t^m(y^{m,n})$, since $|\hat z^{m,n}| \le m$.
It follows that

$$|\psi_t^{m,n+1}| \le m^2 \int_0^t |\psi_s^{m,n}|\,ds + 2K^2 (1 + m) \int_0^t \int_0^s |\psi_u^{m,n}|\,du\,ds + F(t, m, \nu) \Big[ \int_0^t \Big( \int_0^s |\psi_u^{m,n}|\,du \Big)^2 ds \Big]^{1/2},$$

where $F$ is a $t$-integrable function depending only on $m$ and $\nu$. Thus,
by arguments similar to those for Gronwall's inequality, it is seen that
$\psi^{m,n}$ converge to zero. It follows that $\hat z_m$ are $\nu_\cdot$-adapted, and so is $y_\cdot$
by eq. (2).
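
The Gronwall-type mechanism behind this convergence can be seen in a simplified model. The sketch below is an illustration only: it keeps just the first, linear term of the bound with a constant coefficient `c`, and iterates $\psi_{n+1}(t) = c \int_0^t \psi_n(s)\,ds$ from a constant starting bound; the double-integral and square-root terms of the actual estimate are omitted. Function and parameter names are hypothetical.

```python
def iterate_bound(c, start, t_end, n_iter, n_grid=1000):
    """Iterate psi_{n+1}(t) = c * int_0^t psi_n(s) ds on a uniform grid.

    Returns the list of suprema of psi_0, psi_1, ..., psi_{n_iter} over
    [0, t_end]; the integral is accumulated with the trapezoid rule.
    """
    h = t_end / n_grid
    psi = [float(start)] * (n_grid + 1)
    suprema = [max(psi)]
    for _ in range(n_iter):
        nxt = [0.0]
        acc = 0.0
        for k in range(1, n_grid + 1):
            acc += 0.5 * h * (psi[k - 1] + psi[k])
            nxt.append(c * acc)
        psi = nxt
        suprema.append(max(psi))
    return suprema
```

In this simplified model the $n$-th iterate equals `start` times $(ct)^n / n!$, so the suprema may grow at first but then decay factorially, whatever the size of $c$.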

Remark: The reader is invited to speculate on how the above proof
would be carried out if it were postulated that the dependence of
$b(t, z, y)$ on $y$ in eq. (10) came only through the estimate $\hat z$, as, for
example, $b(t, z, y) = \beta(t, z, \hat z)$. In this case, there is no longer a formula
for $\hat z$, but only a functional equation.

X. DISCUSSION OF THE GENERAL PROBLEM 

There is a general result of measure theory to the effect that a
function $x$ is measurable on the $\sigma$-algebra induced by another function
$y$, iff it is representable by an explicit composition with $y$, that is, as a
function of $y$: $x = \varphi \circ y$. This might be called the "explicit" function
theorem, as opposed to the "implicit" function theorems, like Filippov's
lemma. For here $x$ and $y$ are given and $\varphi$ is to be found, while in
Filippov's lemma $x$ and $\varphi$ are given and $y$ is to be found. We suggest
that the $\sigma$-algebras and innovations problems are very close in spirit
to the ideas around the explicit function theorem. This suggestion only
provides what we think is a hilfsaussichtspunkt; without more
information (about $\hat z$ or $\alpha(\cdot, \cdot)$), and more insight and work, it does not help
settle any particular case. What it helps do, though, is place the problems
and concepts into the general framework of stochastic equations,
especially into the circle of ideas developed by M. P. Yershov.¹⁰ See
also Ref. 18.

Our final results on the innovations problem will clarify the role of
the integral equation relating $y_\cdot$ and $\nu_\cdot$. Thus, for each $t$, $\hat z_t$ is measurable
with respect to $\sigma\{y_s,\ s \le t\}$; hence, there is a causal functional $\alpha$ such
that $\hat z_t = \alpha(t, y)$, or more precisely, such that some version of $\hat z_t$
is indistinguishable from $\alpha(t, y)$; then the relation between $y_\cdot$ and $\nu_\cdot$ is
essentially

$$y_t = \int_0^t \alpha(s, y)\,ds + \nu_t. \tag{16}$$

Thus, it is apparent intuitively, and can be proved, that if $y_\cdot$ is adapted
to $\nu_\cdot$, then there is a causal solution to (16), namely $y_\cdot$ itself, expressible
as $\varphi(\cdot, \nu)$ with $\varphi$ causal. What we shall show is the converse, that
causal solvability somewhere of the stochastic DE (16) implies a
positive answer to the Frost-Kailath conjecture. In particular, we shall
prove that $\sigma\{y_s,\ s \le t\} = \sigma\{\nu_s,\ s \le t\}$ (mod $P$) for each $t$ iff on some
probability space there is a Brownian motion $W_\cdot$ and a causal solution
$\xi$ of

$$\xi_t = \int_0^t \alpha(s, \xi)\,ds + W_t, \tag{17}$$

which induces the same measure as $y_\cdot$ does.

This result gives a necessary and sufficient condition for Frost and
Kailath's innovations conjecture to hold, and it embodies the sense
in which the innovations problem resembles the explicit function
theorem. The direct part or necessity is obvious. For the sufficiency,
we argue that if eq. (17) has a causal solution on some probability
space, then it is expressible as a causal functional $\varphi$ of a Brownian
motion defined there. This functional can be "exported"; i.e., it can
be applied to any other Brownian motion on any other space to give
a causal solution. In particular, applying it to the innovations process
$\nu_\cdot$ gives a causal solution $(\varphi\nu)_t = \varphi(t, \nu)$, which under weak conditions
induces the same measure as $y_\cdot$ does. This, along with the properties

$$(\varphi\nu)_t - \int_0^t \alpha(s, \varphi\nu)\,ds \triangleq (T\varphi\nu)_t = \nu_t \quad \text{a.s.} \tag{18}$$
$$\sigma\{\nu_s,\ s \le t\} \subseteq \sigma\{y_s,\ s \le t\}, \tag{19}$$

allows us to prove the basic property that for any integrable causal
functional $\beta(t, y)$

$$E\{\beta(t, y) \mid \nu_s,\ s \le t\} = \beta(t, \varphi\nu) \quad \text{a.s.}$$

This result can be applied in several ways to give the desired final
result that $y_\cdot$ and $\varphi\nu_\cdot$ are modifications of each other. We describe
two: one a digressive application of martingales and the other short
and direct.

XI. MARTINGALE ARGUMENTS USING P₀

In this section, we use some of the properties of the measure $P_0$
defined in Section VIII. We assume that $\alpha$ is a causal functional such
that $\hat z_t = \alpha(t, y)$; i.e., really such that $\hat z_t$ and $\alpha(t, y)$ are
indistinguishable processes. We let $\mathcal{C}_t$ be the Borel $\sigma$-subalgebra of $C[0, \infty)$
generated by sets of the form $\{x : x_s \in B\}$, $s \le t$, $B$ Borel, and $\mathcal{N}_0^t$ is
$\sigma\{\nu_s,\ 0 \le s \le t\}$, the $\sigma$-algebra generated by the innovations.

Theorem 5: If there is, on some probability space, a Brownian motion $W$
and a causal solution $\xi$ of

$$d\xi_t = \alpha(t, \xi)\,dt + dW_t$$

given by a causal functional $\varphi$ as $\xi_t = \varphi(t, W)$, and if $\varphi\nu$ and $y$ are
identical in law, then they are modifications of each other, and $\mathcal{N}_0^t = \mathcal{Y}_0^t$
(mod $P$).

The proof is the sequence of lemmas which follow. 

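Lemma 1: $E\{(dP_0/dP) \mid \mathcal{N}_0^t\} = q^{-1}(\alpha\varphi\nu, \varphi\nu)_t$.

Proof: Let $A \in \mathcal{N}_0^t$, so that by the integral equation, $A$ differs from
a set of the form $\{Ty \in B\}$, $B \in \mathcal{C}_t$, by at most a null set. Then,

$$\int_A \frac{dP_0}{dP}\,dP = \int_{Ty \in B} q^{-1}(\hat z, y)_t\,dP = \int_{T\varphi\nu \in B} q^{-1}(\alpha\varphi\nu, \varphi\nu)_t\,dP = \int_A q^{-1}(\alpha\varphi\nu, \varphi\nu)_t\,dP.$$

Thus, $q^{-1}(\alpha\varphi\nu, \varphi\nu)_t$ is measurable on $\mathcal{N}_0^t$ and has the same integrals
over $\mathcal{N}_0^t$ sets as $dP_0/dP$; thus, it is a version of $E\{(dP_0/dP) \mid \mathcal{N}_0^t\}$.

Lemma 2: $\varphi\nu_t$ is an $\mathcal{N}_0^t$-martingale under $P_0$.

Proof: $y_\cdot$ is a (Brownian) martingale under $P_0$; since $\mathcal{N}_0^s \subseteq \mathcal{Y}_0^s$, then

$$\int_A y_t\,dP_0 = \int_A y_s\,dP_0 \quad \text{if } A \in \mathcal{N}_0^s \text{ and } s < t.$$

For $A \in \mathcal{N}_0^t$, eq. (2) implies that there exists $B \in \mathcal{C}_t$ with
$A = \{Ty \in B\}$. Since $y \sim \varphi\nu$ in law, there follows

$$\int_A y_t\,dP_0 = \int_{Ty \in B} y_t\,q^{-1}(z, y)_t\,dP = \int_{Ty \in B} y_t\,E\{q^{-1}(z, y)_t \mid \mathcal{Y}_0^t\}\,dP$$
$$= \int_{Ty \in B} y_t\,q^{-1}(\hat z, y)_t\,dP = \int_{T\varphi\nu \in B} (\varphi\nu)_t\,q^{-1}(\alpha\varphi\nu, \varphi\nu)_t\,dP$$
$$= \int_A (\varphi\nu)_t\,E\Big\{\frac{dP_0}{dP} \,\Big|\, \mathcal{N}_0^t\Big\}\,dP = \int_A (\varphi\nu)_t\,dP_0.$$

Similarly, for $s < t$ and $A \in \mathcal{N}_0^s$,

$$\int_A y_s\,dP_0 = \int_A (\varphi\nu)_s\,dP_0.$$

Hence, $A \in \mathcal{N}_0^s$ implies $\int_A [\varphi\nu_t - \varphi\nu_s]\,dP_0 = 0$, and so $\varphi\nu_\cdot$ is an
$(\mathcal{N}_0^\cdot, P_0)$-martingale, being adapted to $\mathcal{N}_0^\cdot$.

Lemma 3: $E_0\{y_t \mid \mathcal{N}_0^t\} = \varphi\nu_t$ a.s.

Proof: This is shown in the same way as Lemmas 1 and 2, by
integrating over $A \in \mathcal{N}_0^t$, and using the adaptedness of $\nu$ to $y$, and the
property $T\varphi\nu = \nu$ a.s.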

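Lemma 4: $\varphi\nu_\cdot$ is a Brownian motion under $(P_0, \mathcal{N}_0^\cdot)$.

Proof: Let $\Delta_s = \hat z_s - \alpha(s, \varphi\nu)$. Then

$$y_t - \varphi\nu_t = \int_0^t \Delta_s\,ds$$
$$(y_t - \varphi\nu_t)^2 = 2 \int_0^t \Delta_s \int_0^s \Delta_u\,du\,ds.$$

However, since $y_\cdot$ and $\varphi\nu_\cdot$ are $P_0$-martingales on $\mathcal{Y}_0^\cdot$ and $\mathcal{N}_0^\cdot$,
respectively, the change of variables formula applied to each separately
gives

$$y_t^2 = 2 \int_0^t y_s\,dy_s + t \tag{20}$$
$$\varphi\nu_t^2 = 2 \int_0^t \varphi\nu_s\,d\varphi\nu_s + \langle \varphi\nu \rangle_t. \tag{21}$$

Also

$$y_t\,\varphi\nu_t = y_t^2 - y_t \int_0^t \Delta_s\,ds.$$

The right-hand side is a product of semimartingales on $\mathcal{Y}_0^\cdot$, and
change of variables gives

$$y_t\,\varphi\nu_t = 2 \int_0^t y_s\,dy_s + t - \int_0^t \Big( \int_0^s \Delta_u\,du \Big)\,dy_s - \int_0^t y_s\,\Delta_s\,ds.$$

Using eqs. (20) and (21), we find that

$$(y_t - \varphi\nu_t)^2 = \langle \varphi\nu \rangle_t - t + 2 \int_0^t \Delta_s \int_0^s \Delta_u\,du\,ds - 2 \int_0^t \Big[ y_s - \int_0^s \Delta_u\,du - \varphi\nu_s \Big]\,dy_s,$$

and since the last bracket vanishes, $\langle \varphi\nu \rangle_t = t$.

Thus, $\varphi\nu_\cdot$ is a continuous martingale with quadratic variation $t$, and
so a Brownian motion, on $(P_0, \mathcal{N}_0^\cdot)$.

Lemma 5: $E_0 (y_t - \varphi\nu_t)^2 = 0$.

Proof: The processes $y_\cdot$ and $\varphi\nu_\cdot$ are Brownian under $P_0$ with respect
to $\mathcal{Y}_0^\cdot$ and $\mathcal{N}_0^\cdot$, respectively, so $E_0 y_t^2 = E_0 \varphi\nu_t^2 = t$. Thus, by Lemma 3

$$E_0 (y_t - \varphi\nu_t)^2 = 2(t - E_0\,y_t\,\varphi\nu_t) = 2(t - E_0\,E_0\{y_t \mid \mathcal{N}_0^t\}\,\varphi\nu_t) = 2(t - E_0 (\varphi\nu)_t^2) = 0.$$

The indicated expectation exists because both $y_\cdot$ and $\varphi\nu_\cdot$ are Brownian
under $P_0$, on respective algebras $\mathcal{Y}_0^\cdot$ and $\mathcal{N}_0^\cdot$. This lemma shows that
$y$ and $\varphi\nu$ are modifications of each other under $P_0$ (and so under $P$),
and completes the proof of Theorem 5.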

XII. DIRECT PROOF OF THEOREM 5 

It is possible to give a short proof of Theorem 5 not depending on
the auxiliary measure $P_0$ or the representation for $\hat z$ given in Section
VIII. This proof depends only on eqs. (18) and (19), the causality of $\varphi$,
and the fact that $y \sim \varphi\nu$ in law; otherwise, it is just an exercise in
integration.

By hypothesis there exists a causal functional $\varphi$ such that $T\varphi x = x$
for almost all $x$ with respect to Wiener measure. Thus, the process
$(\varphi\nu)_t = \varphi(t, \nu)$ (defined on the same probability space as $y_\cdot$ and $\nu_\cdot$)
is identical in law to $y_\cdot$ and is such that $T\varphi\nu = \nu$ with probability one. Let $\beta$
be a causal functional such that $E|\beta(t, y)| < \infty$ for each $t$. The next
step is to prove that

$$E\{\beta(t, y) \mid \nu_s,\ 0 \le s \le t\} = \beta(t, \varphi\nu) \quad \text{a.s.}$$

Let then $A \in \sigma\{\nu_s,\ 0 \le s \le t\}$. $A$ has the form $\{\omega : \nu \in B\}$ with $B$ a
Borel set of $C[0, t]$, so by the integral equation it differs from
$\{\omega : y \in T^{-1}B\}$ by at most a null set. Then, since $\varphi\nu$ and $y$ are identical
in law, and $\varphi \subset T^{-1}$ on a set of Wiener measure one, we find

$$\int_A \beta(t, y)\,dP = \int_{y \in T^{-1}B} \beta(t, y)\,dP = \int_{\varphi\nu \in T^{-1}B} \beta(t, \varphi\nu)\,dP$$
$$= \int_{T\varphi\nu \in B} \beta(t, \varphi\nu)\,dP = \int_{\nu \in B} \beta(t, \varphi\nu)\,dP = \int_A \beta(t, \varphi\nu)\,dP.$$

Thus, $\beta(t, \varphi\nu)$ has the same integrals as $\beta(t, y)$ over sets defined by $\nu_\cdot$
over $[0, t]$, and (since $\varphi$ is causal) is measurable on $\sigma\{\nu_s,\ 0 \le s \le t\}$.
Hence, it is a version of $E\{\beta(t, y) \mid \nu_s,\ 0 \le s \le t\}$. To complete the
proof, let $\beta(t, x) = \alpha^+(t, x)^{1/2}$, where $\alpha^+ = \max\{0, \alpha\}$, to find, since
$y \sim \varphi\nu$ in law, that

$$E\,|\alpha^+(t, y)^{1/2} - \alpha^+(t, \varphi\nu)^{1/2}|^2 = 2E\,\alpha^+(t, y) - 2E\,\alpha^+(t, \varphi\nu)^{1/2}\,E\{\alpha^+(t, y)^{1/2} \mid \nu_s,\ 0 \le s \le t\} = 0.$$

Similarly, $E\,|\alpha^-(t, y)^{1/2} - \alpha^-(t, \varphi\nu)^{1/2}|^2 = 0$. Hence,

$$\alpha(\cdot, y) = \alpha(\cdot, \varphi\nu) \quad \text{a.e. } \lambda \times P \quad (\lambda = \text{Lebesgue measure}),$$

so that for each $t$, by the integral equations,

$$y_t - (\varphi\nu)_t = \int_0^t [\alpha(s, y) - \alpha(s, \varphi\nu)]\,ds = 0 \quad \text{a.s.}$$

Since $\varphi$ is causal, it follows that $y_t$ is equal almost surely to a function
measurable on $\sigma\{\nu_s,\ s \le t\}$. Since this is true for each $t$, it follows that
for each $t$ the algebras $\sigma\{\nu_s,\ s \le t\}$ and $\sigma\{y_s,\ s \le t\}$ are equal (mod $P$).

Remark 1: Since $\int_0^1 \hat z_s^2\,ds < \infty$ a.s., then if also

$$\int_0^1 \alpha(s, \varphi\nu)^2\,ds < \infty \quad \text{a.s.},$$

a theorem of Kailath and Zakai will imply that the respective measures
induced by $y$ and $\varphi\nu$ are each absolutely continuous with respect to
Wiener measure with the same Radon-Nikodym derivative $q[\alpha(x), x]$.
Hence, $y \sim \varphi\nu$ in law, as desired for the hypothesis of Theorem 5.
Thus, the condition that $y$ and $\varphi\nu$ induce the same measure is easily
met, in comparison with the difficulty of finding a causal solution.

Remark 2: The condition in Theorem 5 that $\xi$ be causal can be replaced,
if we are content to work only over a finite interval $[0, T]$, by the
conditions that $\xi$ be a strong solution over $[0, T]$ in the sense of Ref.
16, and that it be nonanticipative. For it has been remarked by
Yershov⁶ that a strong (over $[0, T]$) nonanticipative solution is
necessarily causal.